********************************************
* Regulation-induced pollution substitution
* Review of Economics and Statistics
* Replication package
* Matthew Gibson - mg17@williams.edu
********************************************

**************************
* Data & folder structure
**************************
* File size limits prevent me from putting the entire package into a single .tar archive with the required directory structure, so it is split across several archives. The structure is outlined below.

* Beneath the working directory, the following subdirectories are required. Unzipping the file "code.tar.gz" in your working directory should create all of this structure except the Data directory, which you will have to create manually.
Code - Scripts should execute from anywhere, but placing them in this folder is recommended.
Data - See below for detail on subdirectories.
Graphs - Initially empty. Scripts will populate.
Logs - Initially empty. Scripts will populate.
Tables - Initially empty. Scripts will populate.
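
* The top-level folders can also be created from within Stata. This is a minimal sketch, not part of the package. It assumes "PATH" has been replaced with your working directory, and it is safe to run after unzipping "code.tar.gz".

```stata
* Sketch only. Replace PATH with your working directory.
local work "PATH"
cd "`work'"

* capture suppresses the error Stata raises when a folder already exists,
* so existing folders (e.g. those created by code.tar.gz) are left alone.
foreach d in Code Data Graphs Logs Tables {
    capture mkdir "`d'"
}
```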

* Beneath the /Data directory, the following subdirectories are required. Extracting all data .tar files in the /Data directory will create the necessary structure. In most cases one .tar archive is provided for each subdirectory. The file "other-data.tar.gz" contains a few smaller files and required empty subdirectories.
AQS
CBP
DMR
ECHO
FIPS
Greenstone
Masters
NEI
Nonattainment
Tox_weights
TRI
* In some cases these subdirectories contain zipped data files (usually CSV format) that must be unzipped.
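
* The extraction step can be sketched in Stata as follows. This is an illustration only. It assumes all data archives have been copied into the Data directory, that their names match the pattern "*.tar*" (only "other-data.tar.gz" is named in this README), and that GNU tar is available on the system path, as on Ubuntu. Zipped files inside the subdirectories still need to be unzipped separately.

```stata
* Sketch only. Replace PATH with your working directory.
local work "PATH"
cd "`work'/Data"

* Collect every archive in the Data directory (pattern is an assumption).
local archives : dir "." files "*.tar*"

* Extract each archive in place; GNU tar detects gzip compression itself.
foreach a of local archives {
    shell tar -xf "`a'"
}

* Report any required subdirectory that is still missing.
foreach d in AQS CBP DMR ECHO FIPS Greenstone Masters NEI Nonattainment Tox_weights TRI {
    mata: st_local("found", strofreal(direxists("`d'")))
    if `found' == 0 {
        display as error "Missing subdirectory: `d'"
    }
}
```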


*******************
* Scripts
*******************
* These scripts were executed under Stata versions 13 and 14, on Ubuntu Linux.
* In every script, set the local macro "work" to your working directory by replacing "PATH".
* Scripts should be executed in the listed order. To run the entire contents of a script, all flow control locals should be set equal to 1. Setting a flow control local to 0 will bypass the corresponding code block.
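
* For illustration, the top of a script might be edited along these lines. This is a sketch of the general pattern, not an excerpt from the package. The directory shown and the flow control local names (run_import, run_clean) are hypothetical; use the names that appear in each script.

```stata
* Working directory: replace "PATH" with your own directory, e.g.
local work "/home/user/replication"    // hypothetical directory

* Flow control locals (hypothetical names): 1 = run the block, 0 = skip it.
local run_import 1
local run_clean  0

if `run_import' == 1 {
    display "Import block runs"
}

if `run_clean' == 1 {
    display "Clean block runs"    // bypassed because run_clean is 0
}
```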

* The following user-written Stata commands (available from SSC) are required.
esttab/estout - LaTeX table output.
geodist - Computes distances between lat-long pairs.
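
* Both packages can be installed from SSC within Stata (an internet connection is needed). The esttab command is distributed as part of the estout package.

```stata
* Install the required user-written commands from SSC.
ssc install estout, replace
ssc install geodist, replace

* Confirm that the commands are now on the ado-path.
which esttab
which geodist
```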

* Data processing
Dataproc_TRI.do - Processes EPA Toxic Release Inventory data.
Dataproc_AQS.do - Processes EPA air quality monitor data.
Dataproc_CBP.do - Processes Census County Business Patterns data.
Dataproc_DMR.do - Processes merged DMR-TRI data from EPA pollutant loading tool.
Dataproc_ECHO.do - Processes EPA inspection data from ECHO: Enforcement and Compliance History Online.
Dataproc_Greenbk.do - Processes EPA "Green Book" county non-attainment data.
Dataproc_cnty_merge.do - i) Subsets EPA county nonattainment data. ii) Saves file of nonattainment monitor coordinates. iii) Merges outputs from i and ii into a single county-year data set.
Dataproc_TRI_cnty_merge.do - Merges coordinates of nonattainment monitors onto TRI data by county, computes minimum distances.
Dataproc_TRI_facility_chem_yr.do - Creates alternate TRI data set at facility-year-chemical level.
Dataproc_NEI_bychem.do - Processes EPA National Emissions Inventory data.

* Analysis
Analysis_treatment.do - Estimates effects of CAA non-attainment on air emissions, spatially disaggregated within non-attainment counties.
Dataproc_intrafirm.do - Based on air emissions analysis in previous script, creates treatment variables for intrafirm leakage analysis.
Analysis_chem_panel.do - Analyzes cross-media substitution within chemical, for an appendix table.
Analysis_DMR.do - Compares TRI water emissions to Discharge Monitoring Reports (DMRs).
Analysis_exogeneity.do - Tests for relationships between distance to nearest violating monitor and air emissions (levels and changes).
Analysis_intrafirm.do - Estimates regression models of intrafirm spatial air emissions leakage.
Analysis_NEI.do - Compares TRI air emissions to National Emissions Inventory (NEI).
Analysis_placebo.do - Placebo test of cross-media substitution models.
Analysis_TRI_CBP_compare.do - Compares TRI coverage to County Business Patterns. This is appendix material only.
Analysis_xmedia.do - Estimates regression models of cross-media pollution substitution.
Analysis_descriptive_stats.do - Computes TRI descriptive statistics for the appendix.
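
* The listed order can be run from a single master do-file. The sketch below is not part of the package. It assumes the scripts sit in the Code folder and that the work local has already been set inside each script, because locals defined in one do-file are not visible in the do-files it calls.

```stata
* Sketch only. Replace PATH with your working directory.
local work "PATH"
cd "`work'"

* Scripts in the order listed in this README.
local scripts ///
    Dataproc_TRI Dataproc_AQS Dataproc_CBP Dataproc_DMR Dataproc_ECHO ///
    Dataproc_Greenbk Dataproc_cnty_merge Dataproc_TRI_cnty_merge ///
    Dataproc_TRI_facility_chem_yr Dataproc_NEI_bychem ///
    Analysis_treatment Dataproc_intrafirm Analysis_chem_panel ///
    Analysis_DMR Analysis_exogeneity Analysis_intrafirm Analysis_NEI ///
    Analysis_placebo Analysis_TRI_CBP_compare Analysis_xmedia ///
    Analysis_descriptive_stats

* Run each script in turn; execution stops at the first error.
foreach s of local scripts {
    display as text "Running `s'.do"
    do "Code/`s'.do"
}
```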



